{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "This notebook contains code to train a fully connected deep neural network on MNIST. The principal changes from the previous notebook are:\n",
    "\n",
    "* We have switched from a linear classifier to a deep neural network.\n",
    "\n",
    "* We have added code to visualize the graph and summary data in TensorBoard.\n",
    "\n",
    "* We are using the AdamOptimizer instead of the vanilla GradientDescentOptimizer.\n",
    "\n",
    "* We are using Dropout.\n",
    "\n",
    "An important takeaway: notice the code to train the model is identical to the previous notebook, despite the more complex model.\n",
    "\n",
    "Experiment with this notebook by running the cells and uncommenting code when asked. At the end is a short exercise.\n",
    "\n",
    "When you've finished with this notebook, use TensorBoard to see the results.\n",
    "\n",
    "Although this is a simple model, we can achieve about >97% accuracy on MNIST. Before going any further about accuracy, I'd like to add that by far the most important thing you can spend your time on in machine learning is selecting good features for your problem, and carefully designing your experiment. Invest in those first, before experimenting with the model itself."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from __future__ import absolute_import\n",
    "from __future__ import division\n",
    "from __future__ import print_function\n",
    "\n",
    "import math\n",
    "import os\n",
    "\n",
    "import tensorflow as tf\n",
    "from tensorflow.examples.tutorials.mnist import input_data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "tf.reset_default_graph()\n",
    "sess = tf.Session()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# tip: if you run into problems with TensorBoard\n",
    "# clear the contents of this directory, re-run this script\n",
    "# then restart TensorBoard to see the result\n",
    "LOGDIR = './graphs' "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "mnist = input_data.read_data_sets('/tmp/data', one_hot=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# number of neurons in each hidden layer\n",
    "HIDDEN1_SIZE = 500\n",
    "HIDDEN2_SIZE = 250\n",
    "\n",
    "NUM_CLASSES = 10\n",
    "NUM_PIXELS = 28 * 28\n",
    "\n",
    "# experiment with the nubmer of training steps to \n",
    "# see the effect\n",
    "TRAIN_STEPS = 2000\n",
    "BATCH_SIZE = 100\n",
    "\n",
    "# we're using a different learning rate than the previous\n",
    "# notebook, and a new optimizer\n",
    "LEARNING_RATE = 0.001"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define inputs\n",
    "with tf.name_scope('input'):\n",
    "    images = tf.placeholder(tf.float32, [None, NUM_PIXELS], name=\"pixels\")\n",
    "    labels = tf.placeholder(tf.float32, [None, NUM_CLASSES], name=\"labels\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Function to create a fully connected layer\n",
    "def fc_layer(input, size_out, name=\"fc\", activation=None):\n",
    "    with tf.name_scope(name):\n",
    "        size_in = int(input.shape[1])\n",
    "        w = tf.Variable(tf.truncated_normal([size_in, size_out], stddev=0.1), name=\"weights\")\n",
    "        b = tf.Variable(tf.constant(0.1, shape=[size_out]), name=\"bias\")\n",
    "        wx_plus_b = tf.matmul(input, w) + b\n",
    "        if activation: return activation(wx_plus_b)\n",
    "        return wx_plus_b\n",
    "    \n",
    "# The way we initialize variables has an affect on how quickly \n",
    "# training converges. We may explore with different strategies later.\n",
    "# w = tf.Variable(tf.truncated_normal(shape=[size_in, size_out], stddev=1.0 / math.sqrt(float(size_in))))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define the model\n",
    "\n",
    "# First, we'll create two fully connected layers, with ReLU activations\n",
    "fc1 = fc_layer(images, HIDDEN1_SIZE, \"fc1\", activation=tf.nn.relu)\n",
    "fc2 = fc_layer(fc1, HIDDEN2_SIZE, \"fc2\", activation=tf.nn.relu)\n",
    "\n",
    "# Next, we'll apply Dropout to the second layer\n",
    "# This can help prevent overfitting, and I've added it here\n",
    "# for illustration. You can comment this out, if you like.\n",
    "dropped = tf.nn.dropout(fc2, keep_prob=0.9)\n",
    "\n",
    "# Finally, we'll calculate logists. This will be\n",
    "# the input to our Softmax function. Notice we \n",
    "# don't apply an activation at this layer.\n",
    "# If you've commented out the dropout layer,\n",
    "# switch the input here to 'fc2'.\n",
    "y = fc_layer(dropped, NUM_CLASSES, name=\"output\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define loss and an optimizer\n",
    "with tf.name_scope(\"loss\"):\n",
    "    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=labels))\n",
    "    tf.summary.scalar('loss', loss)\n",
    "\n",
    "with tf.name_scope(\"optimizer\"):\n",
    "    # Whereas in the previous notebook we used a vanilla GradientDescentOptimizer\n",
    "    # here, we're using Adam. This is a single line of code change, and more\n",
    "    # importantly, TensorFlow will still automatically analyze our graph\n",
    "    # and determine how to adjust the variables to decrease the loss.\n",
    "    train = tf.train.AdamOptimizer(LEARNING_RATE).minimize(loss)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define evaluation\n",
    "with tf.name_scope(\"evaluation\"):\n",
    "    # these there lines are identical to the previous notebook.\n",
    "    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(labels, 1))\n",
    "    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))\n",
    "    tf.summary.scalar('accuracy', accuracy)    "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Set up logging.\n",
    "# We'll use a second FileWriter to summarize accuracy on\n",
    "# the test set. This will let us display it nicely in TensorBoard.\n",
    "train_writer = tf.summary.FileWriter(os.path.join(LOGDIR, \"train\"))\n",
    "train_writer.add_graph(sess.graph)\n",
    "test_writer = tf.summary.FileWriter(os.path.join(LOGDIR, \"test\"))\n",
    "summary_op = tf.summary.merge_all()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "sess.run(tf.global_variables_initializer())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for step in range(TRAIN_STEPS):\n",
    "    batch_xs, batch_ys = mnist.train.next_batch(BATCH_SIZE)\n",
    "    summary_result, _ = sess.run([summary_op, train], \n",
    "                                    feed_dict={images: batch_xs, labels: batch_ys})\n",
    "\n",
    "    train_writer.add_summary(summary_result, step)\n",
    "    train_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)\n",
    "    \n",
    "    # calculate accuracy on the test set, every 100 steps.\n",
    "    # we're using the entire test set here, so this will be a bit slow\n",
    "    if step % 100 == 0:\n",
    "        summary_result, acc = sess.run([summary_op, accuracy], \n",
    "                                       feed_dict={images: mnist.test.images, \n",
    "                                                  labels: mnist.test.labels})\n",
    "        test_writer.add_summary(summary_result, step)\n",
    "        test_writer.add_run_metadata(tf.RunMetadata(), 'step%03d' % step)\n",
    "        print (\"test accuracy: %f at step %d\" % (acc, step))\n",
    "\n",
    "\n",
    "print(\"Accuracy %f\" % sess.run(accuracy, \n",
    "                               feed_dict={images: mnist.test.images,\n",
    "                                          labels: mnist.test.labels}))\n",
    "train_writer.close()\n",
    "test_writer.close()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Exercise\n",
    "Add code a third hidden layer and visualize the result in TensorBoard."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
